Measuring the likelihood property of scoring functions in general retrieval models
Abstract
Although retrieval systems based on probabilistic models rank the objects being retrieved (e.g., documents) according to the probability of some matching criterion (e.g., relevance), they rarely yield an actual probability, and the scoring function is interpreted as purely ordinal within a given retrieval task. This paper shows that some scoring functions possess the likelihood property: the score indicates the likelihood of matching when compared across other retrieval tasks, which is potentially more useful than pure ranking, although it cannot be interpreted as an actual probability. This property can be detected by using two modified effectiveness measures, entire precision and entire recall. Empirical evidence is offered to show the existence of this property both for traditional document retrieval and for the analysis of crime data, where suspects of an unsolved crime are ranked according to probability of culpability.
Similar resources
Nonparametric IRT: Scoring functions and ordinal parameter estimation of isotonic probabilistic models (ISOP)
The most popular unidimensional psychological test evaluation rule is the trivial scoring function: the answers to each test item are awarded item scores of 0, 1, ..., m points, and the unweighted (simple, or total) sum of the item scores gives the test score (also called the Likert score). This total score has desirable stochastic ordering properties like monotone likelihood ratio (MLR), stochastic o...
Applying the expertise retrieval model to find expert authors
This research applied the Expertise Retrieval model to find expert authors and used evaluation methods from Information Retrieval systems to measure the performance of those models. The research is experimental, and a variety of methods, including the survey method, were used in the process. Various models were developed for finding expert authors, all built on a known ...
The Smoothed Dirichlet Distribution: Understanding Cross-entropy Ranking in Information Retrieval
The Smoothed Dirichlet Distribution: Understanding Cross-Entropy Ranking in Information Retrieval. September 2006. Ramesh M. Nallapati; B.Tech., Indian Institute of Technology, Bombay; M.S., University of Massachusetts Amherst; M.S., University of Massachusetts Amherst; Ph.D., University of Massachusetts Amherst. Directed by: Prof. James Allan. Unigram Language modeling is a successful probabilistic fr...
Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information
With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...
Evaluation of estimation methods for parameters of the probability functions in tree diameter distribution modeling
One of the most commonly used statistical models for characterizing the variations of tree diameter at breast height is the Weibull distribution. The usual approach for estimating the parameters of a statistical model is maximum likelihood estimation (the likelihood method). Usually, this relies on iterative algorithms such as Newton-Raphson. However, the efficiency of the likelihood method is not...
Journal title:
- JASIST
Volume 60, Issue
Pages -
Publication date 2009